Skip to content

Conversation

@philnik777
Copy link
Contributor

Currently Clang isn't able to constant fold memchr and wmemchr. This adds some basic constant folding to __constexpr_memchr and __constexpr_wmemchr, mostly alleviating the problem.

While not necessarily representative of the real world, an optimized version of the find.pass.cpp test is reduced from ~13k to ~4.5k lines of assembly.

@philnik777 philnik777 requested a review from a team as a code owner June 24, 2024 14:18
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jun 24, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 24, 2024

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

Currently Clang isn't able to constant fold memchr and wmemchr. This adds some basic constant folding to __constexpr_memchr and __constexpr_wmemchr, mostly alleviating the problem.

While not necessarily representative of the real world, an optimized version of the find.pass.cpp test is reduced from ~13k to ~4.5k lines of assembly.


Full diff: https://github.com/llvm/llvm-project/pull/96491.diff

2 Files Affected:

  • (modified) libcxx/include/__string/constexpr_c_functions.h (+13)
  • (modified) libcxx/include/cwchar (+13)
diff --git a/libcxx/include/__string/constexpr_c_functions.h b/libcxx/include/__string/constexpr_c_functions.h
index a978f816f1897..cd5b63ae47d1e 100644
--- a/libcxx/include/__string/constexpr_c_functions.h
+++ b/libcxx/include/__string/constexpr_c_functions.h
@@ -132,6 +132,19 @@ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp* __constexpr_memchr(_Tp*
   static_assert(sizeof(_Tp) == 1 && __libcpp_is_trivially_equality_comparable<_Tp, _Up>::value,
                 "Calling memchr on non-trivially equality comparable types is unsafe.");
 
+  if (__builtin_constant_p(__count) && __builtin_constant_p(__value)) {
+    for (; __count; --__count) {
+      if (!__builtin_constant_p(*__str))
+        break;
+      if (*__str == __value)
+        return __str;
+      ++__str;
+    }
+  }
+
+  if (__builtin_constant_p(__count) && __count == 0)
+    return nullptr;
+
   if (__libcpp_is_constant_evaluated()) {
 // use __builtin_char_memchr to optimize constexpr evaluation if we can
 #if _LIBCPP_STD_VER >= 17 && __has_builtin(__builtin_char_memchr)
diff --git a/libcxx/include/cwchar b/libcxx/include/cwchar
index 08cfac58c846a..76fdf08988dc5 100644
--- a/libcxx/include/cwchar
+++ b/libcxx/include/cwchar
@@ -231,6 +231,19 @@ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp* __constexpr_wmemchr(_Tp
                     __libcpp_is_trivially_equality_comparable<_Tp, _Tp>::value,
                 "Calling wmemchr on non-trivially equality comparable types is unsafe.");
 
+  if (__builtin_constant_p(__count) && __builtin_constant_p(__value)) {
+    for (; __count; --__count) {
+      if (!__builtin_constant_p(*__str))
+        break;
+      if (*__str == __value)
+        return __str;
+      ++__str;
+    }
+  }
+
+  if (__builtin_constant_p(__count) && __count == 0)
+    return nullptr;
+
 #if __has_builtin(__builtin_wmemchr)
   if (!__libcpp_is_constant_evaluated()) {
     wchar_t __value_buffer = 0;

if (__builtin_constant_p(__count) && __count == 0)
return nullptr;

if (__libcpp_is_constant_evaluated()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get rid of this branch since __builtin_constant_p is always true inside a constant expression. IOW, the code path above is always taken when __libcpp_is_constant_evaluated() is true, so the branch below is basically dead now.

static_assert(sizeof(_Tp) == 1 && __libcpp_is_trivially_equality_comparable<_Tp, _Up>::value,
"Calling memchr on non-trivially equality comparable types is unsafe.");

if (__builtin_constant_p(__count) && __builtin_constant_p(__value)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some kind of comment explaining that this is to aid constant-folding and it also handles evaluation inside constant expressions, since that is not super obvious.


if (__libcpp_is_constant_evaluated()) {
// use __builtin_char_memchr to optimize constexpr evaluation if we can
#if _LIBCPP_STD_VER >= 17 && __has_builtin(__builtin_char_memchr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we file a bug report against Clang to complain that __builtin_memchr doesn't constant-fold? It seems like an obvious QOI issue since Clang can already constant-evaluate that function.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From live review: It seems like Clang only fails to constant fold this for a local array. Since that's the case, IMO it would make sense to file a bug against the optimizer instead of working around it in the library.

@philnik777
Copy link
Contributor Author

I've submitted #114668 and #100929 to track the two issues.

@philnik777 philnik777 closed this Nov 2, 2024
@philnik777 philnik777 deleted the constant_fold_wmemchr branch December 17, 2024 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants